NOTICE:  WHEN  GOVERNMENT  OR  OTHER  DRAWINGS,  SPECIFICATIONS  OR  OTHER  DATA 
ARE  USED  FOR  ANY  PURPOSE  OTHER  THAN  IN  CONNECTION  WITH  A DEFINITELY  RELATED 
GOVERNMENT  PROCUREMENT  OPERATION,  THE  U.  S.  GOVERNMENT  THEREBY  INCURS 
NO  RESPONSIBILITY,  NOR  ANY  OBLIGATION  WHATSOEVER;  AND  THE  FACT  THAT  THE 
GOVERNMENT  MAY  HAVE  FORMULATED,  FURNISHED,  OR  IN  ANY  WAY  SUPPLIED  THE 
SAID  DRAWINGS,  SPECIFICATIONS,  OR  OTHER  DATA  IS  NOT  TO  BE  REGARDED  BY 
IMPLICATION  OR  OTHERWISE  AS  IN  ANY  MANNER  LICENSING  THE  HOLDER  OR  ANY  OTHER 
PERSON  OR  CORPORATION,  OR  CONVEYING  ANY  RIGHTS  OR  PERMISSION  TO  MANUFACTURE, 
USE  OR  SELL  ANY  PATENTED  INVENTION  THAT  MAY  IN  ANY  WAY  BE  RELATED  THERETO. 

Reproduced  by 

DOCUMENT  SERVICE  CENTER 


UNCLASSIFIED 


THE  ESTIMATION  OF  BIOLOGICAL  POPULATIONS 


By 

Douglas  G.  Chapman 
University  of  Washington 


Technical Report No. 12
August  15,  1953 


Contract  N8onr-520  Task  Order  II 
Project Number MR-042-038


Laboratory  of  Statistical  Research 
Department  of  Mathematics 
University  of  Washington 
Seattle,  Washington 


The Estimation of Biological Populations ¹ ²

By  Douglas  G.  Chapman 
University  of  Washington 

Summary. A number of statistical models, underlying the methods used in
the estimation of the sizes and other parameters of animal populations,
are set up. The relevant estimation equations are given, with their
variances and covariances. For the most part the theory is designed for
large populations. In setting up the models, consideration has been given
to the desideratum of having them conform as closely as possible to the
actual practices of [word illegible in source] sampling. To what extent
the models do agree with reality is one of the many open questions which
are noted in this paper.

1. Introduction. The use of sampling methods in the enumeration of
populations has become widely known and widely accepted only within the
past generation. Yet it is easily perceived that total enumeration
methods fail for all but the simplest of populations. Particularly is
this true of biological populations which may be mobile in space,
transient in time and difficult of access. The changes in space
(immigration and emigration) and in time (recruitment and mortality) must
often be evaluated to determine the total population size and in any
case these changes are usually of interest in their own right.


¹ Presented as a special invited address at the joint meeting of the
Institute of Mathematical Statistics and the Biometric Society (WNAR)
at Stanford, California, June 19, 1953.

² Work done under the sponsorship of the Office of Naval Research.




In this survey, only those methods are considered for which it is
possible to set up a reasonable statistical model and for which it is
possible to assess the sampling errors. Attention is limited to methods
that lead to absolute rather than relative estimates. Little work has
been done to set up statistical models as a basis of relative
estimates, though for an important exception, attention is called to a
paper of Neyman [22]. To give unity to this survey, only those methods
that have been used in the study of macroscopic mobile populations are
discussed.

Fixed sample methods have been used for the most part in the
enumeration of other biological populations. However, even the enumeration
of sessile populations can give rise to new statistical problems;
many of these are noted in an important study of statistical problems
in ecology that recently has been initiated by Skellam [28]. A further
reference in this field is to a paper by Hoel [13].

2. Tag-and-sample estimates; direct random sampling. When the population
structure is undefined and unknown, it is not possible to select a
fixed sample, as is the case say in ecology or in sampling human
populations. The idea of using an associated variable of known
distribution to build up a sample count into a total population was first
proposed by Laplace [17]. He suggested determining the population of
France from the known number of births in all parishes and from the
fact that the ratio of births to total population could be determined
for some parishes.

Petersen [23], a Danish biologist, first developed the procedure
of marking fish to assist in studying their movements, migration, etc.
He later came to realize that the marked fish could play the same role
for his populations as the births did for Laplace, though evidently he
was unaware of Laplace's work.
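
The ratio argument shared by Laplace and Petersen can be illustrated with
a minimal Python sketch. The function name `petersen_estimate` and the
figures used are illustrative assumptions, not taken from the paper; the
sketch assumes a single tagging, a single random sample, and no losses
between the two.

```python
def petersen_estimate(tagged, sampled, recovered):
    """Simple ratio estimate of population size.

    Assumes the tagged fraction of the sample (recovered / sampled)
    estimates the tagged fraction of the population (tagged / N).
    """
    if recovered <= 0:
        raise ValueError("at least one tag must be recovered")
    return tagged * sampled / recovered


# Hypothetical figures: 500 fish tagged, 400 sampled later, 20 tags found.
print(petersen_estimate(500, 400, 20))
```

The estimate is undefined when no tags are recovered, which is one reason
modified estimates are of interest for small numbers of recoveries.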

When a mathematical model is set up to formalize this intuitive
approach, it is usual to assume random sampling (i.e., sampling such
that the properties "being tagged" and "being sampled" are independent).
It is much easier to make this assumption than to verify it. It is
also standard to assume that the numbers tagged and the numbers sampled
are parameters at the disposal of the experimenter.

A completely adequate model must take into account the birth rate
with possible lag effects, a changing death rate, as well as emigration
and immigration over the period during which repeated tagging and
sampling take place. It is apparent that the number of unknown
parameters is large and that such a model must be complex indeed. Some
simplifying assumptions are desirable.

The  following  model  is  not  the  most  general  possible;  it  does, 
however,  cover  many  of  the  situations  that  have  been  studied  and  it 
leads  to  simple  estimation  procedures.  It  applies  specifically  to 
large  populations  and  it  is  further  assumed  that  either  there  are  no 
recruits  to  the  population  (through  birth  or  immigration)  or  that  new 
recruits  are  distinguishable  and  may  be  eliminated  from  the  samples. 


Model I

Unknown parameters:
    $N_0$ = total population size at time zero
    $P$ = probability that an animal alive at time t survives and
          remains in the population at time t+1

Known parameters:
    $t_i$ = number of animals tagged at the i-th tagging, taking place
            at time $a_i$ (i = 1, 2, 3, ..., m)
    $n_j$ = number of animals sampled in the sample taken at time $b_j$
            (j = 1, 2, 3, ..., r)

Random variables:
    $x_{ij}$ = number of animals originally tagged at the i-th tagging
               and recovered in the j-th sample
    $T_{ij}$ = number of animals originally tagged at the i-th tagging
               available for recovery at the time of the j-th sample
    $N_j$ = population size at the time of the j-th sample

    $f(i)$ = the smallest value of j such that animals tagged at the
             i-th tagging have a positive probability of being
             recovered in the j-th sample


The event of survival is assumed to be independent from animal to
animal. For large $N_0$ it may be assumed that, given $T_{ij}$, $N_j$
(which are not observable r.v.), $x_{ij}$ has a conditional Poisson
distribution with expectation $n_j T_{ij} / N_j$. It then follows that,
for large $N_0$, $x_{ij}$ has approximately a Poisson distribution with

(1)  $E(x_{ij}) = \dfrac{t_i n_j P^{-a_i}}{N_0}$

More precisely this holds as a limiting result as $N_0 \to \infty$ in
such a way that [expression illegible in source] is finite (> 0) but
[expression illegible in source] for k > 1. The result is obtained by
working with the conditional m.g.f. in a standard manner.

With this approximation it is straightforward to set up the maximum
likelihood equations for $N_0$ and $P$, viz.


(2)  $\hat{N}_0 = \dfrac{1}{x_{..}} \sum_{i=1}^{m} \sum_{j=f(i)}^{r} t_i n_j \hat{P}^{-a_i}$

(3)  [equation illegible in source]

where the dot subscripts denote the conventional summation notation.
Equation (3) is a polynomial in P that can be solved by the usual
methods.
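
As a hedged illustration of how the likelihood equations might be solved
numerically, the sketch below works directly from the Poisson
approximation with $E(x_{ij}) = t_i n_j P^{-a_i} / N_0$. The score
equation for P used here is derived from that approximation rather than
copied from the source, and the function name `model1_mle`, the
eligibility rule (sample time later than tagging time) and the use of
bisection are all assumptions of the sketch. At least two distinct
tagging times are needed for P to be estimable.

```python
def model1_mle(t, a, n, b, x, iterations=200):
    """Solve the Poisson-approximation likelihood equations for (N0, P).

    t[i], a[i]: number tagged and time of the i-th tagging
    n[j], b[j]: size and time of the j-th sample
    x[i][j]:    recoveries from tagging i in sample j
    Only pairs with b[j] > a[i] are treated as eligible (assumption).
    """
    m, r = len(t), len(n)
    pairs = [(i, j) for i in range(m) for j in range(r) if b[j] > a[i]]
    x_total = sum(x[i][j] for i, j in pairs)
    if x_total <= 0:
        raise ValueError("no tag recoveries")
    # Observed mean tagging time, weighted by recoveries.
    target = sum(a[i] * x[i][j] for i, j in pairs) / x_total

    def mean_time(p):
        # Expected counterpart of `target` when survival equals p.
        w = [(a[i], t[i] * n[j] * p ** (-a[i])) for i, j in pairs]
        return sum(ai * wi for ai, wi in w) / sum(wi for _, wi in w)

    lo, hi = 1e-6, 1.0           # survival probability lies in (0, 1]
    for _ in range(iterations):  # mean_time decreases as p increases
        mid = 0.5 * (lo + hi)
        if mean_time(mid) > target:
            lo = mid
        else:
            hi = mid
    p_hat = 0.5 * (lo + hi)
    # Given P, the estimate of N0 is the weighted tag-sample sum over
    # the total number of recoveries.
    n0_hat = sum(t[i] * n[j] * p_hat ** (-a[i]) for i, j in pairs) / x_total
    return n0_hat, p_hat
```

If the data suggest a survival rate above one, the bisection stops at the
boundary P = 1, which corresponds to neglecting mortality.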


The inverse of the asymptotic variance-covariance matrix of $\hat{N}_0$
and $\hat{P}$ is

(4)  [matrix illegible in source]

[text missing in source] ... observations ($x_{ij}$) of such a census in
a triangular array, the so-called "trellis diagram" used by Dowdeswell,
Fisher and Ford [10] but much more thoroughly studied by Leslie and
Chitty [19] and by Leslie [18]. Model I departs primarily from that
proposed by Leslie and Chitty in ignoring multiple recaptures. Leslie and
Chitty show this represents a loss of information; for large $N_0$,
however, the expected number of multiple recaptures is very small. In
fact if this is not so, it suggests that the stochastic variation of
$T_{ij}$ and $N_j$ may no longer be negligible. Moreover the multiple
recaptures are often those most suspect from the point of view of
randomness of the sample.

Leslie and Chitty, in common with other investigators, assumed
that mortality and emigration are strictly deterministic. Thus they
are able to write down the expected values of the various classes of
tag recoveries as polynomials in P, and to assume a multinomial
distribution for these tag recoveries. The maximum likelihood equations
can then be formulated, though the solution of the equations can, in
general, be accomplished only by iterative methods. They have studied a
large number of problems in this manner and reference should be made to
their papers for models appropriate to situations not considered here. A
model based on the Poisson distribution can also be set up for most of
these situations, which will be valid for large $N_0$ even though space
and time variations are stochastic variables, and which will often lead
to simpler estimation equations. A complete treatment, considering
this stochastic variation, has not been given for the case of small or
moderate sized populations.


The formulae given above easily specialize to Jackson's "negative"
census [citation illegible in source] (one in which several taggings are
followed by a single sample, at which time only are tag recoveries
noted). Bailey [1] has given the maximum likelihood estimates and their
asymptotic variance-covariance matrices for Jackson's various census
schemes assuming deterministic birth and death rates. Jackson also set up
a "positive" census scheme, which he used to estimate the rate of
recruitment.


By defining a parameter B as the probability that an individual
alive at time t adds a new individual to the population by time t+1,
and assuming that this event is independent of the event of survival,
the model outlined above may be extended and the restriction of no
recruitment may be removed. The $x_{ij}$ still have a Poisson
distribution, to the same approximation as before, and the maximum
likelihood equations for $N_0$, P and B are easily written down. The two
equations involving $\hat{P}$ and $\hat{B}$ are polynomials jointly in
$\hat{P}$ and $\hat{B}$. However, it seems hardly realistic to assume
that the recruitment rate is proportional to the population size or that
it is independent of survival. Another approach is noted later.

Another specialization of formulae (1) to (4) is to put P = 1,
i.e. assume mortality can be neglected. This situation is familiar to
fishery biologists as a Schnabel type census, named for the person who
published a mathematical theory of estimates based on such a multiple
census [27]. More precisely, as noted by the author in [6], for large
$N_0$

[expression illegible in source]

is approximately unbiased with standard deviation given by

[expression illegible in source]

Also confidence limits for the Poisson parameter will yield
confidence limits for $N_0$ in this case; see e.g. [4].
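
The remark about confidence limits can be made concrete with a short
Python sketch. It assumes that, with P = 1, the total number of
recoveries is approximately Poisson with mean $\sum_j n_j T_j / N_0$,
where $T_j$ is the number of tags at large when the j-th sample is taken.
The function names, the simple ratio point estimate (not the almost
unbiased estimate of [6]) and the bisection search are assumptions of the
sketch.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term.

    Adequate for moderate counts; exp(-lam) underflows for very large lam.
    """
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def solve_increasing(f, target, lo, hi, iterations=200):
    """Bisection for f(lam) = target, with f increasing on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def poisson_limits(x, alpha=0.05):
    """Exact two-sided limits for a Poisson mean given an observed count x."""
    top = x + 10.0 * math.sqrt(x + 1.0) + 20.0   # generous search bound
    upper = solve_increasing(lambda lam: 1.0 - poisson_cdf(x, lam),
                             1.0 - alpha / 2.0, 0.0, top)
    if x == 0:
        return 0.0, upper
    lower = solve_increasing(lambda lam: 1.0 - poisson_cdf(x - 1, lam),
                             alpha / 2.0, 0.0, top)
    return lower, upper


def schnabel_estimate(n, tags_at_large, recoveries, alpha=0.05):
    """Point estimate and limits for N0 when mortality is neglected (P = 1)."""
    s = sum(nj * tj for nj, tj in zip(n, tags_at_large))
    x_total = sum(recoveries)
    lam_lo, lam_hi = poisson_limits(x_total, alpha)
    point = s / x_total if x_total > 0 else math.inf
    low = s / lam_hi                      # large Poisson mean, small N0
    high = s / lam_lo if lam_lo > 0 else math.inf
    return point, (low, high)
```

Because the Poisson mean is inversely proportional to $N_0$, the upper
limit for the mean gives the lower limit for the population, and
conversely.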


In the usual Schnabel census tagging is carried on simultaneously
with the sampling process. More precisely, after each sample is examined,
the untagged individuals are tagged and then all are returned to the
population. If this is strictly followed

[equation illegible in source]

and hence the numbers tagged are random variables. For large $N_0$ the
random variation of the numbers tagged may be neglected. In fact it has
usually been disregarded in any case.

It is apparent that there may have to be some restriction on m
and r to make the results given above meaningful. In particular if
m = r = 1, no estimation of the parameters $N_0$ and P is possible, but
estimation of $N_0$ is possible if $a_1 = 0$. This is the simple Petersen
situation: a single tagging followed by a single sample. The formulae
in this case are seen not to depend, for large $N_0$, on mortality
assumptions. While the variance of [expression illegible in source], the
almost unbiased estimate of $N_0$, which is given by

[equation illegible in source]

is a function of P, the survival factor, for most practical purposes
this may be disregarded.


3. Tag-and-sample methods: inverse sampling. A modification of the
sampling procedure outlined above has been developed by Bailey [1],
Goodman [12] and the author [6]. If the number of tags to be recovered,
rather than the sample size, is predetermined, estimates are obtained
which are somewhat simpler and slightly more efficient. The most
interesting of these results is that due to Goodman, who considered a
multiple sample type of census for a situation where there is no
recruitment and P = 1 (such a population will hereafter be referred to as
closed). His procedure is sequential in that the decision to stop
sampling is a consequence of the observations.


Model II

Unknown parameter:
    $N_0$ = population size

Known parameters:
    $n_1, n_2, \ldots$ = the predetermined sequence of samples
    $T_i$ = the number of tagged individuals in the population at the
            time the i-th sample is taken ($T_i$, a cumulative total, is
            to be distinguished from $t_i$, the number of tags put out
            in the i-th tagging) (i = 1, 2, 3, ...)
    $x$ = the predetermined number of tagged members to be recovered
          before the sampling experiment stops

Random variable:
    $r$ = the number of samples taken before the x tagged individuals
          are recovered


Sampling is assumed to be random with respect to tagged and untagged
individuals. Then for large $N_0$

Pr (r samples are required to obtain x tagged members)

$= \sum_{j=1}^{x}$ Pr [x - j tags are recovered in first r - 1 samples]
   · Pr [j tags are recovered in the r-th sample]

$= \sum_{j=1}^{x} \dfrac{\left(\sum_{i=1}^{r-1} \lambda_i\right)^{x-j}}{(x-j)!} \cdot \dfrac{\lambda_r^{\,j}}{j!} \cdot e^{-\sum_{i=1}^{r} \lambda_i}$

where we have written $\lambda_i$ for $n_i T_i / N_0$.

Making the change of variable

$u = 2 \sum_{i=1}^{r-1} \lambda_i$,   $\Delta u = 2 \lambda_r$

and letting $N_0 \to \infty$, Duhamel's theorem gives

(11)  $\lim_{N_0 \to \infty} \Pr(a < u \le b) = \lim \sum \left[ \dfrac{(u/2)^{x-1}}{(x-1)!} \, e^{-u/2} \, \dfrac{\Delta u}{2} + o(\Delta u) \right] = \int_a^b \dfrac{1}{2^x \Gamma(x)} \, e^{-u/2} u^{x-1} \, du$

i.e. $2 \sum_{i=1}^{r} n_i T_i / N_0$ has a limiting $\chi^2$ distribution
with 2x degrees of freedom.


It follows that the (asymptotic) minimum variance unbiased estimate
of $N_0$ is

(12)  $\hat{N}_0 = \dfrac{1}{x} \sum_{i=1}^{r} n_i T_i$

and

$\sigma^2_{\hat{N}_0} = \dfrac{N_0^2}{x}$


The proof given above differs from that of Goodman: he considered
the Schnabel type of census where tagging and sampling are performed
in the same operation, i.e. all untagged individuals are tagged before
the sample is returned to the population. What he showed, namely
that [expression illegible in source] has, asymptotically, a $\chi^2$
distribution with 2x d.f., is equivalent to the above result. In this
case it is simple to find the average sample size (for large $N_0$). For
these results and other exact sample results reference is made to
Goodman's paper cited, [12].
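
A minimal Python sketch of the inverse sampling estimate follows. It
assumes the limiting result stated above, namely that twice the sum of
$n_i T_i$ divided by $N_0$ behaves like a chi-square variable with 2x
degrees of freedom, so that the sum divided by x is unbiased for $N_0$
with standard deviation about $N_0 / \sqrt{x}$. The function name and the
example figures are hypothetical.

```python
import math


def inverse_sampling_estimate(n, tags_at_large, x):
    """Estimate N0 when sampling stops after x tag recoveries.

    n[i]:             size of the i-th sample actually taken
    tags_at_large[i]: tagged animals in the population at that sample
    x:                predetermined number of recoveries
    """
    if x <= 0:
        raise ValueError("x must be positive")
    s = sum(ni * ti for ni, ti in zip(n, tags_at_large))
    n0_hat = s / x
    # Var(chi-square with 2x d.f.) = 4x, so sd(N0_hat) is about
    # N0 / sqrt(x); the unknown N0 is replaced by its estimate here.
    sd_hat = n0_hat / math.sqrt(x)
    return n0_hat, sd_hat


# Hypothetical census: three samples were needed to recover 10 tags.
print(inverse_sampling_estimate([50, 50, 40], [100, 150, 200], 10))
```

Since the coefficient of variation is about $1/\sqrt{x}$ whatever the
population size, the number of recoveries x can be fixed in advance to
reach a desired relative precision.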

The simplicity of $\sigma^2_{\hat{N}_0}$ may make it particularly useful
in designing the sample census. Up to the moment, however, the several
inverse sampling schemes proposed have not been tried out. How to choose
the sequence $\{n_i\}$ in an optimum manner remains an open question. Nor
has any attempt been made to set up a theory of inverse sampling for
other than closed populations.


4. Tag-and-sample estimates; regression approach. The assumptions
underlying Model I may fail for a variety of reasons: imperfect sampling,
clustering of the populations, variation over the populations and over
time of the mortality (or emigration) rate, etc. In view of the
considerable superimposed variability that may thus exist, in addition to
strictly multinomial (or Poisson) variation, it is pertinent to ask
whether a linear regression model might not be more appropriate.


Model III. The same notation as Model I is needed. However, the
restriction that there be no recruitment may be removed. Hence it is
more reasonable to regard $N_1, N_2, \ldots$ as unknown parameters to be
estimated. Furthermore the definition of P can be extended as follows:
P = the average probability that an individual alive at time t
survives and remains within the population to time t+1.

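If the sampling is such that

(13)  $E(x_{ij} \mid T_{ij}) = \dfrac{n_j T_{ij}}{N_j}$

it follows that

(14)  $E(x_{ij}) = \dfrac{t_i n_j P^{\,b_j - a_i}}{N_j}$

The regression approach might be based upon the assumption that

(15)  $E\left[\ln \dfrac{t_i n_j}{x_{ij} + 1}\right] = (a_i - b_j) \ln P + \ln N_j$

and that $\ln (x_{ij} + 1)$ has a constant variance (approximately).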

The factor ($x_{ij} + 1$) is suggested by the fact that the reciprocal
of a binomial or Poisson r.v. plus one is an (almost) unbiased estimate
of the reciprocal of the parameter. Moreover such a device avoids the
difficulties of occasional zeros; care must be exercised if the zeros
are numerous or in sequence, for the assumption above may then be
clearly invalid. The logarithmic transformation is suggested by the
product nature of equation (14). However, it is also true that the
variance of the logarithm of a variable that is distributed according
to the Poisson law is constant up to terms of order [symbol illegible in
source]$^{-1}$. Furthermore, the logarithmic transformation has been
extensively used in analysing data obtained from pelagic hauls or
catches (cf. e.g. Windsor and Clark [30]).

Best linear unbiased estimates of ln P and ln $N_j$ are found by the
least squares method (under these assumptions). From these, estimates
of P and $N_j$ are obtainable which have optimum asymptotic properties
though not necessarily optimum small sample properties. Interval
estimates may also be obtained by postulating approximate normality
of the ln ($x_{ij} + 1$). Such interval estimates may be much more
realistic than those based on Model I, if there is in fact superimposed
variability due to the causes indicated or to other causes.
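
The least squares step can be sketched in Python as follows. The sketch
assumes the regression of $\ln [t_i n_j / (x_{ij} + 1)]$ on
$(a_i - b_j)$ with a common slope ln P and a separate intercept
$\ln N_j$ for each sample, and it uses only recoveries from taggings made
before the sample. The function name `model3_regression` and the
within-sample centering used to obtain the common slope are choices made
for this illustration.

```python
import math


def model3_regression(t, a, n, b, x):
    """Least squares fit of ln(t_i n_j / (x_ij + 1)) on (a_i - b_j).

    One intercept ln N_j per sample and a common slope ln P.
    Only pairs with b[j] > a[i] are used (assumption).
    """
    groups = {}                       # sample index -> list of (d, y)
    for j in range(len(n)):
        for i in range(len(t)):
            if b[j] > a[i]:
                d = a[i] - b[j]
                y = math.log(t[i] * n[j] / (x[i][j] + 1.0))
                groups.setdefault(j, []).append((d, y))
    # Common slope from deviations about each sample's own means.
    sxx = sxy = 0.0
    means = {}
    for j, pts in groups.items():
        d_bar = sum(d for d, _ in pts) / len(pts)
        y_bar = sum(y for _, y in pts) / len(pts)
        means[j] = (d_bar, y_bar)
        sxx += sum((d - d_bar) ** 2 for d, _ in pts)
        sxy += sum((d - d_bar) * (y - y_bar) for d, y in pts)
    if sxx == 0.0:
        raise ValueError("need a sample with recoveries from two taggings")
    ln_p = sxy / sxx
    n_hat = {j: math.exp(y_bar - ln_p * d_bar)
             for j, (d_bar, y_bar) in means.items()}
    return math.exp(ln_p), n_hat
```

The residuals about the fitted lines give the estimate of extraneous
variation that distinguishes this approach from Model I.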

Model III represents in a sense an omnibus model. It has the
advantage that an estimate of the extraneous variation can be made from
the observations. On the other hand, it is imprecise and heuristic
rather than rigorous. If the heterogeneities noted can be carefully
assayed, if not controlled, it may be possible to set up a model which
has this advantage and is at the same time more exact.

This type of approach would give some flexibility to the assumptions
underlying Jackson's positive census (where a single tagging is
followed by a sequence of samples) or more generally to the "trellis
diagram" census scheme where recruitment is to be taken into account
by a single parameter. Redefining B as the average probability that an
individual within the population at time t adds a new recruit to the
population at time t+1, similar assumptions as those above lead to

(16)  $N_j = N_0 (P + B)^{b_j}$

Hence estimates of B and $N_0$ could be derived from the least squares
estimates of ln P, ln (P + B) and ln $N_0$ from the equation:

(17)  $E\left[\ln \dfrac{t_i n_j}{x_{ij} + 1}\right] = (a_i - b_j) \ln P + \ln N_0 + b_j \ln (P + B)$


5. Dichotomy methods. A method of estimating population size that has
been used in wildlife research, and which may be useful in other
fields, is based on the change of sex ratio caused by a selective kill.
The sex ratio is determined before and after the kill by sampling
methods. Several references to field applications of the method are
listed by Scattergood [25] in a general survey of methods of
population estimation.

The estimation procedure may be based upon any dichotomy within
the population, or even on external factors: all that is required is
a sampling process followed by a selective removal of individuals from
the population, and subsequently a further sampling process. Closed
populations only will be considered.


Model IV

Unknown parameters:
    $N_i$ = population at time $t_i$ (i = 1, 2), made up of two classes
            X and Y
    $X_i, Y_i$ = size of classes X, Y at time $t_i$

Known parameters:
    $n_i$ = size of random samples taken at time $t_i$
    $r_x = X_1 - X_2$,   $r_y = Y_1 - Y_2$,   $r = r_x + r_y$

Random variables:
    $x_i$ = number of elements of class X in sample $n_i$ (i = 1, 2)
    $y_i$ = number of elements of class Y in sample $n_i$


Assuming sampling with replacement

(18)  [equation illegible in source]

Since it is assumed that the class sizes at time $t_2$ are expressible in
terms of $X_1$ and $N_1$ and known parameters, estimates of $X_1$ and
$N_1$ are easily found. The moment estimates

(19)  [equation illegible in source]

(20)  [equation illegible in source]

are also maximum likelihood estimates.
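
As a hedged illustration, the moment estimates can be obtained by
equating the sample proportions of class X to the population proportions
before and after the removal. The Python sketch below is derived from
that assumption and from the definitions of $r_x$, $r_y$ and $r$ in
Model IV; it is not a transcription of the formulae displayed in the
source, and the function name is hypothetical.

```python
def change_in_ratio_estimate(n1, x1, n2, x2, r_x, r_y):
    """Moment estimates of N1 and X1 from samples before and after removal.

    n1, x1: size of the first sample and its count of class X
    n2, x2: the same for the second sample
    r_x, r_y: known removals from classes X and Y between the samples
    """
    p1 = x1 / n1                     # sample proportion of X before removal
    p2 = x2 / n2                     # sample proportion of X after removal
    if p1 == p2:
        raise ValueError("the removal did not change the sample ratio")
    r = r_x + r_y
    # Solve X1 / N1 = p1 and (X1 - r_x) / (N1 - r) = p2 for N1 and X1.
    n1_hat = (r_x - r * p2) / (p1 - p2)
    x1_hat = p1 * n1_hat
    return n1_hat, x1_hat


# Hypothetical figures: 300 of class X and 50 of class Y are removed.
print(change_in_ratio_estimate(200, 120, 130, 60, 300, 50))
```

The estimate becomes unstable when the two sample proportions are nearly
equal, i.e. when the removal is only weakly selective.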

Formulae for the asymptotic variance-covariance matrix are as
follows:

(21)  [matrix illegible in source]

so that

(22)  [equation illegible in source]

where $P_i = X_i / N_i$ (i = 1, 2).


These formulae may be used to determine the optimum theoretical
allocation of sampling between the before and after samples. It is
also interesting to use them to compare the effort required for this
type of census with that required for tag sample methods. A numerical
study shows that the tag sample method has the advantage, assuming
that the tags are sampled by the removal process. However, the
evaluation is incomplete without some means of determining the relative
costs of sampling and tagging. Moreover it is reasonable to suppose
that the assumptions underlying the dichotomy method are more likely
to be fulfilled than in the tag-sample method: questions of tag
mortality and differential recapture rates do not arise.

In some situations it may be possible to sample two populations,
e.g. a sport fish and a scrap fish. The sports fishery then serves
as the selective removal factor, a very favorable situation since $r_y$
will be zero. In this case X is the parameter that it is of interest
to estimate.

The method may also be applied where the removal is done by the
sampler. In this case it is more realistic to assume a succession of
samples are taken. Again it is straightforward to set up the model for
this situation and to derive the maximum likelihood equations for X and
N. This naturally suggests a sequential estimation procedure where the
decision to stop is determined by the sample results.

If there is dilution or elimination, the procedure is obviously
vitiated. As yet no work has been done to extend the method to
estimate these factors. Estimates of mortality for example might be based
on a trichotomy or on an intermediate sampling during the removal
process. The several sample scheme (sequential or not) would lend
itself [text missing in source] complicated situation.


6. Methods based on the notion of effort. That the amount of effort
expended in obtaining a given sample of a population is proportional to
the population density has long been the basis of relative population
estimates. Leslie and Davis [20] and independently DeLury [7] showed
how absolute estimates could be determined from this information, when
the successive samples are removed from the population, as for example
occurs in the catch of a fishery. Except for this catch, the
population is assumed closed. A model similar to DeLury's is as follows:


Model V

Unknown parameters:
    $N_0$ = initial population size
    $k$ = average probability that an individual is captured by one
          unit of effort in any time interval

Known parameter:
    $K_t$ = total catch up to but not including the t-th interval

Random variable:
    $C_t$ = catch per unit of effort during the t-th interval
            (t = 1, 2, ..., m)


If the units of effort are independent it follows that

$E(C_t) = k (N_0 - K_t)$

With the further assumption that $\sigma^2_{C_t}$ is approximately
constant (which is reasonable for large $N_0$ unless the cumulative catch
represents a large segment of the population by the end of the
experiment), least squares estimates of k and $N_0$ may be found. In
particular

(23)  $\hat{N}_0 = \bar{K} - \dfrac{\bar{C} \sum_{t=1}^{m} (K_t - \bar{K})^2}{\sum_{t=1}^{m} C_t (K_t - \bar{K})}$
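
The least squares fit of the relation $E(C_t) = k (N_0 - K_t)$ can be
sketched in Python as below. The function name `leslie_delury_estimate`
and the example series are hypothetical; the sketch treats the
cumulative catches as fixed and the catch per unit of effort as having
constant variance, as assumed in the text.

```python
def leslie_delury_estimate(catch_per_effort, cumulative_catch):
    """Least squares estimates of N0 and k from E(C_t) = k (N0 - K_t)."""
    c, k_cum = catch_per_effort, cumulative_catch
    m = len(c)
    if m < 2 or m != len(k_cum):
        raise ValueError("need matching series of length two or more")
    c_bar = sum(c) / m
    k_bar = sum(k_cum) / m
    sxx = sum((kt - k_bar) ** 2 for kt in k_cum)
    sxy = sum(ct * (kt - k_bar) for ct, kt in zip(c, k_cum))
    if sxx == 0 or sxy == 0:
        raise ValueError("catch per unit effort shows no trend")
    k_hat = -sxy / sxx                   # slope of the fitted line is -k
    n0_hat = k_bar - c_bar * sxx / sxy   # where the fitted line meets C = 0
    return n0_hat, k_hat


# Hypothetical series with one unit of effort per interval.
print(leslie_delury_estimate([100.0, 90.0, 81.0, 72.9], [0, 100, 190, 271]))
```

Geometrically, the estimate of $N_0$ is the cumulative catch at which the
fitted line of catch per unit of effort reaches zero.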


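If the further assumption of approximate normality of the $C_t$ is
made, confidence intervals for $N_0$ are

(24)  $\bar{K} - y_1 < N_0 < \bar{K} - y_2$

where $y_1 > y_2$ are the roots of the equation

(25)  $y^2 \left\{ \left[\sum C_t (K_t - \bar{K})\right]^2 - q \sum (K_t - \bar{K})^2 \right\} - 2 \bar{C} \left[\sum (K_t - \bar{K})^2\right] \left[\sum C_t (K_t - \bar{K})\right] y + \left(\bar{C}^2 - \dfrac{q}{m}\right) \left[\sum (K_t - \bar{K})^2\right]^2 = 0$

where $q = t^2(m-2) \, s_C^2$ and $s_C^2$ is the estimated error variance
of $C_t$.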
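
A hedged Python sketch of the interval computation follows. It sets up a
quadratic in $y = \bar{K} - N_0$ from the Fieller argument applied to the
fitted line of $C_t$ on $K_t$; the critical value of Student's t with
m - 2 degrees of freedom is supplied by the caller, and the function name
`fieller_interval` is hypothetical. When the slope is not clearly
different from zero no finite interval exists and the sketch returns
None.

```python
import math


def fieller_interval(c, k_cum, t_crit):
    """Fieller-type limits for N0 in the model E(C_t) = k (N0 - K_t).

    t_crit is the two-sided Student t critical value with m - 2 d.f.,
    supplied by the caller. Returns None when the limits are not finite.
    """
    m = len(c)
    c_bar = sum(c) / m
    k_bar = sum(k_cum) / m
    sxx = sum((kt - k_bar) ** 2 for kt in k_cum)
    sxy = sum(ct * (kt - k_bar) for ct, kt in zip(c, k_cum))
    slope = sxy / sxx
    resid = [ct - c_bar - slope * (kt - k_bar) for ct, kt in zip(c, k_cum)]
    s2 = sum(e * e for e in resid) / (m - 2)     # estimated error variance
    q = t_crit ** 2 * s2
    # Quadratic qa*y^2 + qb*y + qc = 0 in y = K_bar - N0.
    qa = sxy ** 2 - q * sxx
    qb = -2.0 * c_bar * sxx * sxy
    qc = (c_bar ** 2 - q / m) * sxx ** 2
    disc = qb ** 2 - 4.0 * qa * qc
    if qa <= 0 or disc < 0:
        return None                  # slope not clearly different from zero
    root = math.sqrt(disc)
    y_small = (-qb - root) / (2.0 * qa)
    y_large = (-qb + root) / (2.0 * qa)
    return k_bar - y_large, k_bar - y_small
```

For example, with five intervals the caller would pass the t value for
3 degrees of freedom (about 3.182 for a 95 per cent interval).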

The confidence intervals are obtained by the Fieller technique [29],
and hence there is a non-zero probability that the method will give
interval estimates of the form $(0, \infty)$ or $(0, N')$, $(N'', \infty)$.
Also it should be observed that the model is essentially a conditional
one, i.e. conditional upon the values $K_1, K_2, \ldots, K_m$.

DeLury  has  also  considered  the  possibility  of  weighting  the  least 
squares  estimates,  though  he  suggests  that  such  a procedure  may  be 
meaningless  if  the  sampling  is  not  random.  This  is  very  likely  the 
case  in  utilizing  commercial  or  sports  catch  record  or  in  sampling 
schooling  populations  for  example.  For  a further  discussion  of  these 
points  and  of  the  method  in  general  reference  is  made  to  [7]  and  [8]. 

For the case where the effort is constant, Moran [21] has set down
a model based on the assumption of random sampling. The model may
easily be extended to the case where the effort varies from period to
period. A somewhat more interesting extension is based on a combination
of tag and sample and catch per unit of effort methods. The case
of a closed population is still considered.


[text missing in source]

Hence

(28)  [equation illegible in source]

The maximum likelihood equations for k and $N_0$ are

(29)  [equation illegible in source]

(30)  [equation illegible in source]

The inverse of the variance-covariance matrix of $\hat{k}$ and
$\hat{N}_0$, expressed in terms of the [symbol illegible in source], is:

[matrix illegible in source]


7. Further Problems. Each of the models set up and others that have been
considered involves one or more assumptions which it is difficult or
impossible to verify directly. For example underlying the
tag-and-sample models there is the assumption that tagged members of the
population behave similarly to the untagged members, at least in respect
to recapture. A primary assumption of the methods based on effort is that
catchability is constant.



Some empirical studies have been made to verify the estimates of
populations by sampling methods. In some experiments conducted on
fresh-water lakes the whole population has been poisoned out (a
procedure that can hardly be recommended as an enumeration procedure
except where the elimination of the existent populations has been the
primary aim). The agreement has been satisfactory for some species but
not for all; for example cf. Carlander [citation number illegible in
source]. It should be remarked that sampling methods have often been
necessary in connection with the estimates determined from the dead
recoveries.

Such methods of verification have at best limited application. It
is necessary to design sampling experiments specifically for this purpose.
In this connection it is suggested that combinations of the various
methods outlined may be useful. This has been proposed by DeLury [9]:
his discussion of the underlying assumptions of sample census methods
is particularly pertinent.

Such combinations, of which Model VI is an example, may also yield
more information than the application of a single method. Of course if
the sampling is being done by a succession of commercial catches,
Model VI is the appropriate one rather than Model I, though the
heterogeneities introduced by such commercial catch sampling may suggest
a regression model, i.e. an extension of Model III.

In Model I the numbers tagged and sampled were regarded as
parameters; in actual fact they may also be random variables. For example
if the sampling is done by a commercial catch, the proportion of the
catch from the population to be estimated may be determined only by
sampling experiments. This situation complicates the interval estimates
and while a crude determination of a confidence interval for $N_0$ is
possible by a sequence procedure, this patently wastes information. The
several variations of this situation that may arise suggest the necessity
of a study of confidence intervals in connection with compound
distributions.

Referring again to Model I, it may be recognized from the outset
that heterogeneity exists within the sampling procedure. If it is
possible to subdivide the tagging and sampling into periods (by time or
area for example), within each of which random sampling may be assumed,
then it is possible to obtain consistent estimates, though the interval
estimation problem is unsolved. This situation was first considered by
Schaeffer [26].

An obvious extension of Model V is to assume that the probability
of capture, rather than being constant over the population, is itself a
random variable. The distribution of the probability of capture may
perhaps be related to the expected catch in any time interval. Additional
information is available if different methods of capture are being used
simultaneously; in fact in this case the restriction that the population
is closed may be relaxed and an estimation procedure set up for the
population size at each time interval.

As has been implied, the interval estimation problem remains
unsolved for many of the models, except for the large sample results.
Correspondingly, the sample theory of tests in connection with such
models has been given almost no attention. Some simple applications of
the χ² test have been given by Leslie [18] and by the author [5]. As
more intricate experiments are designed and more careful control plans
undertaken it will be necessary to consider tests for recruitment and
mortality rates, for example.
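
As an illustration of the kind of simple χ² application mentioned above,
the following is a minimal sketch, not the procedure of Leslie [18] or of
[5]. It assumes the quantity of interest is whether the proportion of
tagged animals is the same in each of several sampling periods, and it
computes the ordinary Pearson χ² statistic for homogeneity. The function
name and its arguments are hypothetical, introduced only for this example.

```python
def chi_square_homogeneity(tagged, sampled):
    """Pearson chi-square statistic for equal tagged proportions.

    tagged[i]  -- number of tagged animals found in the sample of period i
    sampled[i] -- total number of animals examined in period i
    Returns (statistic, degrees_of_freedom).
    """
    if len(tagged) != len(sampled) or len(tagged) < 2:
        raise ValueError("need matching counts for at least two periods")

    # Pooled proportion of tagged animals under the homogeneity hypothesis.
    p = sum(tagged) / sum(sampled)
    if p <= 0.0 or p >= 1.0:
        raise ValueError("pooled tagged proportion must lie strictly in (0, 1)")

    statistic = 0.0
    for s, n in zip(tagged, sampled):
        expected_tagged = n * p
        expected_untagged = n * (1.0 - p)
        # Contribution of the tagged and untagged cells for this period.
        statistic += (s - expected_tagged) ** 2 / expected_tagged
        statistic += ((n - s) - expected_untagged) ** 2 / expected_untagged

    return statistic, len(tagged) - 1


# Hypothetical data: two periods of 100 animals each, with 10 and 20 tagged.
stat, df = chi_square_homogeneity([10, 20], [100, 100])
```

The statistic is referred to the χ² distribution with the returned degrees
of freedom; a large value suggests heterogeneity between periods of the
kind discussed in connection with Model I. The approximation is a large
sample one and should not be relied on when expected counts are small.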

The complexities of estimating the birth, death, emigration and
immigration rates indicate that it will be necessary to set up special
experiments to adequately determine these factors. Some of the
experiments set up by Jackson [15], where marking and recovery were carried
on in a series of adjacent areas, were designed for this purpose. Random
walk theory has been applied in one special situation by Gilmour,
Waterhouse and McIntyre [11]. The study of birth and death processes, and
of processes associated with random as well as migratory movement, is
necessarily associated with the population estimation problem and the
latter will be completely solved only when the problems associated with
these stochastic processes are resolved.




REFERENCES 

A.  Population  Estimation 


[1] Norman T. J. Bailey, "On estimating the size of mobile
populations from recapture data", Biometrika 38 (1951) pp. 293-306.

[2] -----, "Improvements in the interpretation of recapture data",
Journal of Animal Ecology 21 (1952) pp. 120-127.

[3] Kenneth D. Carlander and William M. Lewis, "Some precautions in
estimating fish populations", Prog. Fish. Cult. 10 (1948)
pp. 134-137.

[4] Douglas G. Chapman, "A mathematical study of confidence limits of
salmon populations calculated from sample tag ratios",
Internat. Pacific Salmon Fisheries Comm. Bulletin No. 2 (1948)
pp. 69-85.

[5] -----, "Some properties of the hypergeometric distribution with
applications to zoological sample censuses", Univ. California
Publ. Stat. 1 (1951) pp. 131-160.

[6] -----, "Inverse, multiple and sequential sample censuses",
Biometrics 8 (1952) pp. 286-306.

[7] D. B. DeLury, "On the estimation of biological populations",
Biometrics 3 (1947) pp. 145-167.

[8] -----, "On the planning of experiments for the estimation of fish
populations", Jour. Fish. Res. Bd. Can. 8 (1951) pp. 281-307.

[9] -----, "On the assumptions underlying estimates of mobile
populations", Biostatistics Conference, Ames, Iowa, 1952 (to be
published).



[10] W. H. Dowdeswell, R. A. Fisher and E. B. Ford, "The quantitative
study of population in the Lepidoptera. I. Polyommatus icarus
Rott.", Ann. Eugen., Lond., 10 (1940) pp. 123-136.

[11] D. Gilmour, D. F. Waterhouse and G. A. McIntyre, "An account of
experiments undertaken to determine the natural population
density of the sheep blowfly Lucilia cuprina Wied.", Bull. Aust.
Council Sci. Ind. Res. no. 195.

[12] Leo A. Goodman, "Sequential sampling tagging for population size
problems", Ann. Math. Statistics 24 (1953) pp. 56-69.

[13] Paul G. Hoel, "The accuracy of sampling methods in ecology", Ann.
Math. Statistics 14 (1943) pp. 289-300.

[14] C. H. N. Jackson, "Some new methods in the study of Glossina
morsitans", Proc. Zool. Soc. London 4 (1936) pp. 811-896.

[15] -----, "The analysis of an animal population", J. Animal Ecology
8 (1939) pp. 236-246.

[16] -----, "The analysis of a tsetse-fly population", Ann. Eugen.,
Lond., 10 (1940) pp. 332-369.

[17] P. S. Laplace, "Sur les naissances, les mariages et les morts",
Histoire de l'Académie Royale des Sciences, Année 1783, Paris,
p. 693 (actually published in 1786).

[18] P. H. Leslie, "The estimation of population parameters from data
obtained by means of the capture-recapture method, II The
estimation of total numbers", Biometrika 39 (1952) pp. 363-388.

[19] P. H. Leslie and Dennis Chitty, "The estimation of population
parameters from data obtained by means of the capture-recapture
method, I The maximum likelihood equations for estimating the
death-rate", Biometrika 38 (1951) pp. 269-292.




[20] P. H. Leslie and D. H. S. Davis, "An attempt to determine the
absolute number of rats on a given area", J. Animal Ecology 8
(1939) pp. 94-113.

[21] P. A. P. Moran, "A mathematical theory of animal trapping",
Biometrika 38 (1951) pp. 307-311.

[21a] -----, "The estimation of death rates from capture-mark-recapture
sampling", Biometrika 39 (1952) pp. 181-186.

[22] J. Neyman, "On the problem of estimating the number of schools of
fish", Univ. California Publ. Stat. 1 (1949) pp. 21-36.

[23] C. G. J. Petersen, "The yearly immigration of young plaice into the
Limfjord from the German Sea, etc.", Rept. Danish Biol. Sta.
for 1895, 6 (1896) pp. 1-48 (cf. also later reports).

[24] William E. Ricker, "Methods of estimating vital statistics of
fish populations", Indiana Univ. Publ. Science Series No. 15
(1948).

[25] Leslie W. Scattergood, "Estimating fish and wildlife populations:
a survey of methods", Biostatistics Conference, Ames, Iowa, 1952
(to be published).

[26] Milner B. Schaeffer, "Estimation of the size of animal populations
by marking experiments", U. S. Fish and Wildlife Ser. Fish.
Bull. 69 (1951) pp. 191-203.

[27] Z. E. Schnabel, "Estimation of the total fish population of a
lake", Amer. Math. Monthly 45 (1938) pp. 348-352.

[28] J. G. Skellam, "Studies in statistical ecology. I Spatial
pattern", Biometrika 39 (1952) pp. 346-362.




E. C. Fieller, "The biological standardization of insulin", Suppl.
Jour. Royal Stat. Soc. 7 (1940) pp. 1-65.

C. P. Winsor and G. L. Clarke, "A statistical study of variation
in the catch of plankton nets", Jour. Marine Research 3
(1940) pp. 1-34.




